[magpietts] added multiple validation dataloaders and log metrics per val data #15348
XuesongYang wants to merge 8 commits into NVIDIA-NeMo:main
Conversation
Force-pushed from 861e8b3 to fae5fcb
…VIDIA-NeMo#15189)

* added multiple validation dataloaders and log metrics per val data.

* Apply suggestion from @XuesongYang
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Apply suggestion from @Copilot
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Apply suggestion from @Copilot
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Apply suggestion from @Copilot
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

---------

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ation to on_validation_epoch_end.
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Force-pushed from fae5fcb to c9cc855
Pull request overview
Adds support for validating MagpieTTS on multiple datasets (multiple validation dataloaders) while improving how media artifacts (audio + attention visualizations) are prepared and logged to W&B/TensorBoard, and updates the example Lhotse config to the new dataset configuration structure.
Changes:
- Refactors validation media logging by separating data preparation (numpy arrays) from logger-specific emission (W&B/TB objects).
- Adds multi-dataloader validation support, including per-dataloader metric aggregation and an averaged validation loss for checkpointing.
- Updates the MagpieTTS Lhotse example config to remove the dataset: nesting and introduce a validation_ds.datasets list format.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| nemo/collections/tts/models/magpietts.py | Implements multi-validation-dataloader handling, refactors media logging, and adjusts Lhotse dataloader config expectations. |
| examples/tts/conf/magpietts/magpietts_lhotse.yaml | Updates example configuration to match the new train/validation dataset config structure and multi-val datasets list format. |
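As a rough illustration of the prepare-then-emit split mentioned in the changes above, here is a hypothetical sketch (not the PR's actual code; the function names, dictionary keys, and sample rate are assumptions):

import numpy as np
import wandb

def prepare_validation_media(audio_pred, attn_matrix, sample_rate=22050):
    # Preparation step: keep everything as plain numpy arrays so the output
    # stays logger-agnostic and easy to unit test.
    return {
        'audio_pred': np.asarray(audio_pred, dtype=np.float32),
        # Attention map assumed already scaled to [0, 1] for image logging.
        'attention': np.asarray(attn_matrix, dtype=np.float32),
        'sample_rate': sample_rate,
    }

def log_media_to_wandb(wandb_run, media, global_step):
    # Emission step: only here are W&B-specific objects created.
    wandb_run.log(
        {
            'val/audio_pred': wandb.Audio(media['audio_pred'], sample_rate=media['sample_rate']),
            'val/attention': wandb.Image(media['attention']),
        },
        step=global_step,
    )

Keeping the prepared output as numpy also makes it straightforward to feed the same arrays to a TensorBoard writer in a separate emission function.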
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
…exists in val ds config
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
    return log_dict

def on_validation_epoch_end(self):
This method is too long. I would suggest creating two separate methods like:
if is_multi_dataloader:
    self._process_multi_dataloader_validation()
else:
    self._process_single_dataloader_validation()
Instead of maintaining four separate lists (all_losses, all_codebook_losses, all_alignment_losses, all_aligner_encoder_losses, all_local_transformer_losses), switch to a dict. That keeps it clean, something like:
for metric_name, metric_value in dataloader_logs.items():
    self.log(f"Loss:{dataloader_prefix}/{metric_name}", metric_value, prog_bar=False, sync_dist=True)
    aggregated_metrics.setdefault(metric_name, []).append(metric_value)
- Using a dict instead of 4 separate lists: this is a good suggestion! Will make the changes.
- Splitting into two methods: reasonable for readability, though the method is currently ~115 lines, which is manageable. The split would reduce cognitive load. After the refactoring in item 1, the ~115 lines are reduced to ~50 lines.
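For concreteness, a rough sketch of what the dict-based aggregation could look like after the refactor; the validation_step_outputs attribute, its nesting, and the dataloader prefix format are assumptions rather than the final implementation:

import torch

def on_validation_epoch_end(self):
    # Normalize to a list of per-dataloader output lists so the single- and
    # multi-dataloader cases share one code path.
    outputs = self.validation_step_outputs
    if outputs and isinstance(outputs[0], dict):
        outputs = [outputs]

    aggregated = {}  # metric name -> list of per-dataloader mean values
    for dataloader_idx, dataloader_outputs in enumerate(outputs):
        dataloader_prefix = f"dataloader_{dataloader_idx}"
        per_metric_values = {}
        for step_output in dataloader_outputs:
            for metric_name, metric_value in step_output.items():
                if metric_value is not None:
                    per_metric_values.setdefault(metric_name, []).append(metric_value)
        for metric_name, values in per_metric_values.items():
            mean_value = torch.stack(values).mean()
            self.log(f"Loss:{dataloader_prefix}/{metric_name}", mean_value, prog_bar=False, sync_dist=True)
            aggregated.setdefault(metric_name, []).append(mean_value)

    # Average val_loss across dataloaders so a single value can drive checkpointing.
    if 'val_loss' in aggregated:
        self.log('val_loss', torch.stack(aggregated['val_loss']).mean(), sync_dist=True)
    self.validation_step_outputs.clear()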
# Compute required metrics
val_loss = collect_required_metric(outputs, 'val_loss')
Just a suggestion: I see the metric names being hard-coded everywhere and loosely used. Instead, we should define them in a constant Enum.
I may not follow which metrics are "hard-coded" in this context. Could you please elaborate?
On second thought about your suggestion, maybe we can end up with something like the following:
from enum import Enum

class ValidationMetric(Enum):
    """Validation metric keys used in validation_step outputs and logging."""
    LOSS = 'loss'
    CODEBOOK_LOSS = 'codebook_loss'
    ALIGNMENT_LOSS = 'alignment_loss'
    ALIGNER_ENCODER_LOSS = 'aligner_encoder_loss'
    LOCAL_TRANSFORMER_LOSS = 'local_transformer_loss'

# Usage:
val_loss = collect_required_metric(outputs, f'val_{ValidationMetric.LOSS.value}')
log_dict = {
    ValidationMetric.LOSS.value: val_loss,
    ValidationMetric.CODEBOOK_LOSS.value: val_codebook_loss,
}

If my understanding aligns with yours, I would leave it for future cleanup, because we are not adding new metrics frequently. It is a good suggestion, and we can track it as a follow-up improvement.
Sure, let's take it up for future cleanup.
val_codebook_loss = collect_required_metric(outputs, 'val_codebook_loss')

# Compute optional metrics
val_alignment_loss = collect_optional_metric(outputs, 'val_alignment_loss')
Can something like this be done?
for metric in VAL_METRIC_LIST:
    log_dict[metric] = collect_optional_metric(outputs, metric)
We need to check for Nones.
None values are already handled in validation_step, so there is no need to check for them again here. Besides, collect_optional_metric returns None if a metric does not exist in outputs.
> Can something like this be done?
>
> for metric in VAL_METRIC_LIST:
>     log_dict[metric] = collect_optional_metric(outputs, metric)
>
> We need to check for Nones.

We only have two required losses, so I believe there is no need to wrap them in a for loop. Similarly, there are only three optional losses. I would leave it as is unless we need to add many more losses in the future.
I meant doing something like the following: if we need to add any additional loss in the future, we do not need to touch this code, and it also makes things much cleaner.
VAL_METRIC_LIST = ['val_loss', 'val_codebook_loss', 'val_alignment_loss', ...]

# This is where a global list/Enum of metrics would help, so we can just iterate over them.
# But for now a local list would be fine.
for metric in VAL_METRIC_LIST:
    metric_value = collect_optional_metric(outputs, metric)
    if metric_value is not None:
        log_dict[metric] = metric_value
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
@subhankar-ghosh @blisc I addressed all your suggestions. Please have a look.
    'context_audios': context_audios,
}

def _log_media_to_wandb_and_tb(self, media_data: ValidationMediaData, global_step: int) -> None:
I am not a fan of passing dataclasses to functions, especially if you do not reuse them. It hides the underlying datatypes of the class, and it is not intuitive from the docstring what you are supposed to pass to this function. Is there an argument for why you want to do it this way?
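For context, a small hypothetical sketch of the two call shapes being discussed; apart from ValidationMediaData and context_audios, which appear in the diff above, the field names are assumptions:

from dataclasses import dataclass
import numpy as np

@dataclass
class ValidationMediaData:
    # Typed fields document what the logging function expects.
    audio_preds: list[np.ndarray]
    context_audios: list[np.ndarray]
    attn_images: list[np.ndarray]
    sample_rate: int

# Option A: pass the dataclass, as the PR currently does:
#     self._log_media_to_wandb_and_tb(media_data, global_step)
#
# Option B: pass explicit arguments, as the reviewer prefers, so the
# signature itself spells out the underlying types:
#     def _log_media_to_wandb_and_tb(self, audio_preds, context_audios,
#                                    attn_images, sample_rate, global_step): ...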
Summary
Details
The YAML config for the validation datasets uses a list of dataset entries, which generalizes naturally to datasets in multiple languages.
wandb log see here: https://wandb.ai/aiapps/debug_magpieTTS_EN_2509/runs/bqerks4y?nw=nwuserxuesong_yang
The model YAML config keys that were previously under train_ds.dataset are now directly under train_ds. The configuration structure for validation_ds changes to a list of datasets.
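Purely as an illustration of how a model might consume the new list-based validation config (the name key, the _setup_dataloader_from_config helper, and the attribute names below are hypothetical, not taken from the PR):

from omegaconf import DictConfig

def setup_multiple_validation_data(self, cfg: DictConfig):
    # cfg.validation_ds.datasets is a list of per-dataset configs, one entry
    # per validation set (for example, one per language).
    self._validation_dl = []
    self._validation_names = []
    for idx, ds_cfg in enumerate(cfg.validation_ds.datasets):
        # Hypothetical helper that builds one Lhotse dataloader from one entry.
        self._validation_dl.append(self._setup_dataloader_from_config(ds_cfg))
        # An optional per-dataset name can drive the per-dataloader logging prefix.
        self._validation_names.append(ds_cfg.get('name', f'val_{idx}'))
    # Returning a list of dataloaders gives Lightning a dataloader_idx in
    # validation_step and per-dataloader outputs to aggregate.
    return self._validation_dl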